Robust supervised learning with coordinate gradient descent
Authors
Abstract
This paper considers the problem of supervised learning with linear methods when both features and labels can be corrupted, either in the form of heavy-tailed data and/or corrupted rows. We introduce a combination of coordinate gradient descent as a learning algorithm together with robust estimators of the partial derivatives. This leads to robust statistical learning methods that have a numerical complexity nearly identical to that of non-robust ones based on empirical risk minimization. The main idea is simple: while robust learning with gradient descent requires the computational cost of robustly estimating the whole gradient to update all parameters, a parameter can be updated immediately using a robust estimator of a single partial derivative in coordinate gradient descent. We prove upper bounds on the generalization error of the algorithms derived from this idea, which control the optimization error even without a strong convexity assumption on the risk. Finally, we propose an efficient implementation of this approach in a new Python library called linlearn, and demonstrate through extensive experiments that our approach introduces an interesting compromise between robustness, statistical performance and numerical efficiency for this problem.
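To make the idea concrete, here is a minimal numpy sketch of robust coordinate gradient descent for least squares, using a median-of-means estimator of each partial derivative. This is an illustrative sketch only, not the linlearn implementation: the choice of loss and estimator, and all function and parameter names (median_of_means, robust_cgd, n_blocks, lr), are assumptions made for this example.

```python
import numpy as np

def median_of_means(values, n_blocks):
    # Split per-sample contributions into blocks, average each block,
    # and return the median of the block means (robust to outliers).
    blocks = np.array_split(values, n_blocks)
    return np.median([block.mean() for block in blocks])

def robust_cgd(X, y, n_epochs=50, lr=0.01, n_blocks=10):
    # Coordinate gradient descent for the loss (1/2n) * ||X w - y||^2:
    # each coordinate is updated immediately with a robust estimate of a
    # single partial derivative, instead of robustly estimating the whole
    # gradient before any parameter moves.
    n, d = X.shape
    w = np.zeros(d)
    residual = X @ w - y                 # kept in sync with w below
    for _ in range(n_epochs):
        for j in range(d):
            # Per-sample contributions to the j-th partial derivative.
            partials = residual * X[:, j]
            g_j = median_of_means(partials, n_blocks)
            step = lr * g_j
            w[j] -= step
            residual -= step * X[:, j]   # incremental residual update
    return w
```

Note how the residual is maintained incrementally: one full pass over the coordinates costs about the same as a pass of plain coordinate descent plus the per-coordinate robust-estimation overhead, which is what keeps the numerical complexity close to the non-robust baseline.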
Related papers
Robust Block Coordinate Descent
In this paper we present a novel randomized block coordinate descent method for the minimization of a convex composite objective function. The method uses (approximate) partial second-order (curvature) information, so that the algorithm performance is more robust when applied to highly nonseparable or ill-conditioned problems. We call the method Robust Coordinate Descent (RCD). At each iteratio...
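A minimal sketch of the kind of curvature-preconditioned block update this snippet describes, specialized to least squares for concreteness (the paper treats general convex composite objectives; the function and parameter names here are illustrative, not RCD's actual interface):

```python
import numpy as np

def block_cd_curvature(X, y, block_size=5, n_epochs=30, damping=1e-6, seed=0):
    # Randomized block coordinate descent for least squares where each
    # block step is preconditioned with that block's (approximate)
    # second-order curvature H_B = X_B^T X_B / n, so badly scaled or
    # strongly coupled coordinates within a block are handled together.
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    blocks = [np.arange(s, min(s + block_size, d))
              for s in range(0, d, block_size)]
    H = [X[:, B].T @ X[:, B] / n + damping * np.eye(len(B)) for B in blocks]
    for _ in range(n_epochs):
        for k in rng.permutation(len(blocks)):
            B = blocks[k]
            g = X[:, B].T @ (X @ w - y) / n   # gradient restricted to block B
            w[B] -= np.linalg.solve(H[k], g)  # curvature-preconditioned step
    return w
```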
Learning Structured Classifiers with Dual Coordinate Descent
We present a unified framework for online learning of structured classifiers. This framework handles a wide family of convex loss functions that includes as particular cases CRFs, structured SVMs, and the structured perceptron. We introduce a new aggressive online algorithm that optimizes any loss in this family; for the structured hinge loss, this algorithm reduces to 1-best MIRA; in general, ...
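For the structured hinge loss, the aggressive update mentioned in the snippet reduces to 1-best MIRA, whose closed-form step can be sketched as follows (mira_update is a hypothetical helper, assuming feature vectors for the gold and predicted outputs have already been computed):

```python
import numpy as np

def mira_update(w, feats_gold, feats_pred, cost, C=1.0):
    # One passive-aggressive (1-best MIRA) step: take the smallest weight
    # change that fixes the current margin violation, capped at C.
    delta = feats_gold - feats_pred         # feature difference
    loss = max(0.0, cost - w @ delta)       # cost-augmented hinge loss
    if loss == 0.0:
        return w                            # margin already satisfied
    tau = min(C, loss / (delta @ delta + 1e-12))
    return w + tau * delta

# Example: one update on toy feature vectors.
w = mira_update(np.zeros(4),
                np.array([1., 0., 1., 0.]),   # features of the gold output
                np.array([0., 1., 0., 1.]),   # features of the prediction
                cost=1.0)
```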
Learning Output Kernels with Block Coordinate Descent
We propose a method to learn simultaneously a vector-valued function and a kernel between its components. The obtained kernel can be used both to improve learning performance and to reveal structures in the output space which may be important in their own right. Our method is based on the solution of a suitable regularization problem over a reproducing kernel Hilbert space of vector-valued func...
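A rough sketch of such an alternating (block coordinate) scheme for a separable vector-valued kernel model: given the output kernel L, the coefficients C of the vector-valued function solve a Sylvester equation; given C, the output kernel is refreshed from the positive semidefinite matrix C^T K C. The objective, the normalized L-refresh, and all names here are assumptions for illustration, not the update rules from the paper:

```python
import numpy as np

def fit_output_kernel(K, Y, lam=0.1, n_iters=20):
    # K: (n, n) input Gram matrix; Y: (n, m) multi-output targets.
    # Alternates between (i) solving K C L + lam * C = Y for C via
    # eigendecompositions of K and L, and (ii) setting L proportional to
    # C^T K C, which is PSD by construction (a simplified heuristic).
    n, m = Y.shape
    L = np.eye(m)
    s, U = np.linalg.eigh(K)
    for _ in range(n_iters):
        t, V = np.linalg.eigh(L)
        Yt = U.T @ Y @ V
        Ct = Yt / (np.outer(s, t) + lam)   # elementwise Sylvester solve
        C = U @ Ct @ V.T
        M = C.T @ K @ C                    # PSD candidate output kernel
        L = m * M / max(np.trace(M), 1e-12)
    return C, L
```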
Learning to learn by gradient descent by gradient descent
The move from hand-designed features to learned features in machine learning has been wildly successful. In spite of this, optimization algorithms are still designed by hand. In this paper we show how the design of an optimization algorithm can be cast as a learning problem, allowing the algorithm to learn to exploit structure in the problems of interest in an automatic way. Our learned algorit...
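The paper itself trains an LSTM optimizer by backpropagating through the unrolled optimization; the toy sketch below keeps only the core idea in self-contained numpy, with an assumed two-parameter update rule meta-trained by finite differences on random quadratics:

```python
import numpy as np

def unrolled_loss(theta, rng, T=20, dim=5):
    # Meta-objective: run T steps of the parameterized update rule
    #   w <- w - theta[0] * g - theta[1] * m   (m is a momentum buffer)
    # on a random quadratic and return the final loss.
    A = rng.normal(size=(dim, dim))
    A = A.T @ A + np.eye(dim)              # random well-posed quadratic
    b = rng.normal(size=dim)
    w = np.zeros(dim)
    m = np.zeros(dim)
    for _ in range(T):
        g = A @ w - b
        m = 0.9 * m + g
        w = w - theta[0] * g - theta[1] * m
    return 0.5 * w @ A @ w - b @ w

def meta_train(n_meta=200, eps=1e-3, meta_lr=0.005, seed=0):
    # Optimizer design cast as a learning problem: descend the unrolled
    # loss via a central finite-difference meta-gradient on random tasks.
    rng = np.random.default_rng(seed)
    theta = np.array([0.05, 0.0])
    for _ in range(n_meta):
        task_seed = int(rng.integers(1 << 30))
        grad = np.zeros_like(theta)
        for i in range(len(theta)):
            e = np.zeros_like(theta)
            e[i] = eps
            # Paired seeds: both evaluations see the same random task.
            lp = unrolled_loss(theta + e, np.random.default_rng(task_seed))
            lm = unrolled_loss(theta - e, np.random.default_rng(task_seed))
            grad[i] = (lp - lm) / (2 * eps)
        theta -= meta_lr * np.clip(grad, -5.0, 5.0)
    return theta
```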
Large Scale Semi-supervised Linear SVM with Stochastic Gradient Descent
Semi-supervised learning tries to employ a large collection of unlabeled data and a few labeled examples for improving generalization performance, which has proved meaningful in real-world applications. The bottleneck of existing semi-supervised approaches lies in the overly long training time due to the large-scale unlabeled data. In this article we introduce a novel method for semi-supervised l...
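One common way to set this up, sketched below under assumptions (the article's exact formulation may differ): SGD on a hinge loss over labeled points plus a symmetric hinge max(0, 1 - |w·x|) on unlabeled points, a standard transductive surrogate that pushes the decision boundary away from unlabeled data. All names and parameters here are illustrative.

```python
import numpy as np

def s3vm_sgd(X_lab, y_lab, X_unl, lam=1e-3, gamma=0.1,
             n_epochs=10, lr0=0.1, seed=0):
    # X_lab: labeled features, y_lab: labels in {-1, +1},
    # X_unl: unlabeled features, gamma: weight of the unlabeled term.
    rng = np.random.default_rng(seed)
    w = np.zeros(X_lab.shape[1])
    data = [(x, y) for x, y in zip(X_lab, y_lab)] + \
           [(x, None) for x in X_unl]
    t = 0
    for _ in range(n_epochs):
        for i in rng.permutation(len(data)):
            x, y = data[i]
            t += 1
            lr = lr0 / (1 + lr0 * lam * t)   # decaying step size
            w *= (1 - lr * lam)              # L2 regularization shrinkage
            if y is not None:
                if y * (w @ x) < 1:          # labeled hinge subgradient
                    w += lr * y * x
            else:
                s = w @ x
                if abs(s) < 1:               # symmetric hinge on unlabeled x
                    w += lr * gamma * np.sign(s) * x
    return w
```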
Journal
Journal title: Statistics and Computing
Year: 2023
ISSN: 0960-3174, 1573-1375
DOI: https://doi.org/10.1007/s11222-023-10283-7